
Minion Gated Recurrent Unit for Continual Learning

Zyarah, Abdullah M., Kudithipudi, Dhireesha

arXiv.org Artificial Intelligence

The increasing demand for continual learning in sequential data processing has led to progressively complex training methodologies and larger recurrent network architectures. Consequently, this has widened the gap between continual learning with recurrent neural networks (RNNs) and their ability to operate on devices with limited memory and compute. To address this challenge, we investigate the effectiveness of simplifying RNN architectures, particularly the gated recurrent unit (GRU), and the impact of such simplification on both single-task and multitask sequential learning. We propose a new GRU variant, the minion recurrent unit (MiRU), which replaces conventional gating mechanisms with scaling coefficients that regulate dynamic updates of the hidden state and historical context, reducing computational costs and memory requirements. Despite its simplified architecture, MiRU maintains performance comparable to the standard GRU while training 2.90x faster and using 2.88x fewer parameters, as demonstrated on sequential image classification and natural language processing benchmarks. We also investigate the impact of model simplification on learning capacity by performing continual learning tasks with a rehearsal-based strategy and global inhibition. We find that MiRU demonstrates stable performance in multitask learning even when using only rehearsal, unlike the standard GRU and its variants. These features position MiRU as a promising candidate for edge-device applications.
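The abstract describes MiRU's core change, replacing input-dependent gates with scaling coefficients, but does not give the update equations. A minimal NumPy sketch of that general idea, assuming a single fixed pair of coefficients; the function name, coefficient values, and initialization below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def scaled_recurrent_step(x, h, Wx, Wh, b, alpha=0.5, beta=0.5):
    """One step of a gate-free recurrent cell: instead of computing
    input-dependent update/reset gates (as a GRU does), the new hidden
    state blends the previous state and a candidate state using fixed
    scaling coefficients alpha and beta."""
    h_cand = np.tanh(Wx @ x + Wh @ h + b)   # candidate state
    return alpha * h + beta * h_cand        # scaled blend, no gates

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
Wx = rng.standard_normal((n_hid, n_in)) * 0.1
Wh = rng.standard_normal((n_hid, n_hid)) * 0.1
b = np.zeros(n_hid)

h = np.zeros(n_hid)
for t in range(5):                          # run a short input sequence
    h = scaled_recurrent_step(rng.standard_normal(n_in), h, Wx, Wh, b)
print(h.shape)  # (8,)
```

Dropping the two gate computations removes their weight matrices and per-step matrix products, which is consistent with the parameter and training-time savings the abstract reports.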


Reviews: Neural networks grown and self-organized by noise

Neural Information Processing Systems

The main contributions of this paper are an algorithm that learns a pooling architecture and one that grows the architecture using only self-organization principles. The developmental algorithm is evaluated on a different input geometry and in experiments with faults in the first layer. A final experiment evaluates the proposed algorithms on an MNIST classification task. I like the originality of the work: the authors propose the principle of a growing machine that is able to yield a functional architecture from a limited set of rules. The principles to follow for building such a self-organized network are clearly exposed.


Unsupervised Learning by Competing Hidden Units

Krotov, Dmitry, Hopfield, John

arXiv.org Machine Learning

It is widely believed that the backpropagation algorithm is essential for learning good feature detectors in the early layers of artificial neural networks, so that these detectors are useful for the task performed by the higher layers of the network. At the same time, the traditional form of backpropagation is biologically implausible. In the present paper we propose an unusual learning rule, which has a degree of biological plausibility and is motivated by Hebb's idea that a change in synapse strength should be local, i.e., it should depend only on the activities of the pre- and post-synaptic neurons. We design a learning algorithm that utilizes global inhibition in the hidden layer and is capable of learning early feature detectors in a completely unsupervised way. These learned lower-layer feature detectors can then be used to train higher-layer weights in the usual supervised way, so that the performance of the full network is comparable to that of standard feedforward networks trained end-to-end with backpropagation.
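As a toy illustration of local Hebbian learning under global inhibition, the sketch below lets hidden units compete so that only the most active one updates. This is a simplified stand-in, not the paper's actual plasticity rule (which is more elaborate); the rule, learning rate, and layer sizes are assumptions:

```python
import numpy as np

def competitive_hebbian_update(W, x, lr=0.01):
    """Simplified local learning step with global inhibition: all hidden
    units compete, the most active unit wins (the rest are suppressed),
    and only the winner moves its weights toward the current input.
    The update depends only on pre- and post-synaptic quantities, so it
    is local in Hebb's sense."""
    acts = W @ x
    winner = int(np.argmax(acts))          # global inhibition: one winner
    W[winner] += lr * (x - W[winner])      # Hebbian pull toward the input
    return W

rng = np.random.default_rng(1)
W = rng.standard_normal((10, 784)) * 0.01  # 10 hidden feature detectors
for _ in range(100):                       # unsupervised passes over data
    x = rng.random(784)                    # stand-in for an image vector
    W = competitive_hebbian_update(W, x)
```

After such unsupervised training, the rows of `W` play the role of the learned lower-layer feature detectors, on top of which a supervised readout can be trained as the abstract describes.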


Context dependent amplification of both rate and event-correlation in a VLSI network of spiking neurons

Chicca, Elisabetta, Indiveri, Giacomo, Douglas, Rodney J.

Neural Information Processing Systems

Cooperative-competitive networks are believed to play a central role in cortical processing and have been shown to exhibit a wide set of useful computational properties. We propose a VLSI implementation of a spiking cooperative-competitive network and show how it can perform context-dependent computation both in the mean-firing-rate domain and in spike-timing correlation space. In the mean-rate case the network amplifies the activity of neurons belonging to the selected stimulus and suppresses the activity of neurons receiving weaker stimuli. In the event-correlation case, the recurrent network amplifies with a higher gain the correlation between neurons which receive highly correlated inputs, while leaving the mean firing rate unaltered. We describe the network architecture and present experimental data demonstrating its context-dependent computation capabilities.
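The mean-rate behavior described above can be illustrated with a simple rate-model simulation of recurrent excitation plus global inhibition. The weights, time step, and input values below are assumptions for illustration; the paper's network is an analog VLSI circuit of spiking neurons, not this simulation:

```python
import numpy as np

def soft_wta(inputs, w_exc=0.6, w_inh=0.5, steps=50, dt=0.1):
    """Rate-model sketch of a cooperative-competitive network: each unit
    receives its external input plus recurrent self-excitation, minus
    global inhibition proportional to the total network activity. The
    strongest stimulus is amplified while weaker ones are suppressed."""
    r = np.zeros_like(inputs, dtype=float)
    for _ in range(steps):
        drive = inputs + w_exc * r - w_inh * r.sum()
        r += dt * (-r + np.maximum(drive, 0.0))   # rectified rate dynamics
    return r

r = soft_wta(np.array([1.0, 0.8, 0.2]))
print(int(r.argmax()))  # 0: the strongest input wins the competition
```

With these settings the output contrast between the winning unit and its competitors exceeds the input contrast, mirroring the selective amplification and suppression the abstract reports in the mean-rate domain.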



Attentional Processing on a Spike-Based VLSI Neural Network

Wang, Yingxue, Douglas, Rodney J., Liu, Shih-Chii

Neural Information Processing Systems

The neurons of the neocortex communicate by asynchronous events called action potentials (or 'spikes'). However, for simplicity of simulation, most models of processing by cortical neural networks have assumed that the activations of their neurons can be approximated by event rates rather than taking account of individual spikes. The obstacle to exploring the more detailed spike processing of these networks has been reduced considerably in recent years by the development of hybrid analog-digital Very-Large-Scale Integrated (hVLSI) neural networks composed of spiking neurons that are able to operate in real time. In this paper we describe such an hVLSI neural network that performs an interesting task of selective attentional processing, previously described for a simulated 'pointer-map' rate model by Hahnloser and colleagues. We found that most of the computational features of their rate model can be reproduced in the spiking implementation, but that spike-based processing requires a modification of the original network architecture in order to memorize a previously attended target.


